A common goal of privacy research is to release synthetic data that satisfies a formal privacy guarantee and can be used by an analyst in place of the original data. To achieve reasonable accuracy, a synthetic data set must be tuned to support a specified set of queries accurately, sacrificing fidelity for other queries.

This work considers methods for producing synthetic data under differential privacy and investigates what makes a set of queries "easy" or "hard" to answer. We consider answering sets of linear counting queries using the matrix mechanism, a recent differentially-private mechanism that can reduce error by adding complex correlated noise adapted to a specified workload.

Our main result is a novel lower bound on the minimum total error required to simultaneously release answers to a set of workload queries. The bound reveals that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form. The bound is most informative for $(\epsilon,\delta)$-differential privacy but also applies to $\epsilon$-differential privacy.
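As a rough illustration of the setting the abstract describes, the sketch below shows a matrix-mechanism-style release in NumPy: a workload $W$ of linear counting queries over a histogram is answered through a strategy matrix $A$ with Gaussian noise, and workload answers are reconstructed from the noisy strategy answers. The example workload (prefix sums), the trivial strategy choice $A = W$, and the privacy parameters are assumptions for illustration only; the paper's lower-bound formula is not reproduced here, though the singular values printed at the end are the kind of spectral data the abstract refers to.

```python
# Minimal sketch of a matrix-mechanism-style release (illustrative, not the
# paper's algorithm). Workload, strategy, and privacy parameters are placeholders.
import numpy as np

rng = np.random.default_rng(0)

n = 8                                  # domain size
x = rng.integers(0, 50, size=n)        # true histogram (counts per domain element)

# Example workload: all prefix-sum (cumulative count) queries, one per row.
W = np.tril(np.ones((n, n)))

# Placeholder strategy: answer the workload queries directly (A = W).
# The matrix mechanism would instead search for a strategy A tuned to W.
A = W.copy()

# Gaussian noise calibrated to the L2 sensitivity of A (max column norm),
# as in (epsilon, delta)-differential privacy; the constants are illustrative.
epsilon, delta = 1.0, 1e-6
l2_sensitivity = np.linalg.norm(A, axis=0).max()
sigma = l2_sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon

noisy_strategy_answers = A @ x + rng.normal(0, sigma, size=A.shape[0])

# Reconstruct workload answers from the noisy strategy answers via least squares.
workload_answers = W @ np.linalg.pinv(A) @ noisy_strategy_answers

# Spectral data of the workload matrix, the quantities the abstract ties to
# workload "hardness" (the exact lower bound is stated in the paper, not here).
singular_values = np.linalg.svd(W, compute_uv=False)
print("singular values of W:", np.round(singular_values, 3))
print("noisy prefix sums:   ", np.round(workload_answers, 1))
print("true prefix sums:    ", np.cumsum(x))
```

A better strategy $A$ (lower-sensitivity queries from which $W$ can still be reconstructed) reduces the total error; the paper's contribution is a lower bound on how much error any such strategy must incur for a given $W$.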